Amoeba: A Shape changing Storage System for Big Data
Abstract
Data partitioning significantly improves query performance in distributed database systems. A large number of techniques have been proposed to efficiently partition a dataset for a given query workload. However, many modern analytic applications involve ad-hoc or exploratory analysis where users do not have a representative query workload upfront. Furthermore, workloads change over time as businesses evolve or as analysts gain a better understanding of their data. Static workload-based data partitioning techniques are therefore not suitable for such settings. In this paper, we describe the demonstration of Amoeba, a distributed storage system that uses adaptive multi-attribute data partitioning to efficiently support ad-hoc as well as recurring queries. Amoeba applies a robust partitioning algorithm such that ad-hoc queries on all attributes have similar performance gains. Thereafter, Amoeba adaptively repartitions the data based on the observed query sequence, i.e., the system improves over time. Throughout, Amoeba offers both adaptivity (i.e., adjustments according to workload changes) and robustness (i.e., avoiding performance spikes due to workload changes). We propose to demonstrate Amoeba on scenarios from an Internet-of-Things startup that tracks user driving patterns. We invite the audience to interactively fire fast ad-hoc queries, observe multi-dimensional adaptivity, and play with a robust/reactive knob in Amoeba. The web front end displays the layout changes and runtime costs, and compares them to Spark with both default and workload-aware partitioning.
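The idea of partitioning on several attributes at once, so that a filter on any of them skips a similar share of the data, can be illustrated with a small sketch. The code below is not Amoeba's algorithm; it is a k-d-tree-style approximation that assumes median splits and a round-robin choice of attribute, and the names `build_tree`, `leaves_to_scan`, `speed`, and `hour` are hypothetical.

```python
from statistics import median_low


def build_tree(rows, attrs, max_depth, depth=0):
    """Split rows on the median of one attribute per level, chosen round-robin.

    Assumption: this balanced, k-d-tree-like layout only stands in for the
    robust multi-attribute partitioning the abstract describes.
    """
    if depth == max_depth or len(rows) <= 1:
        return {"leaf": True, "rows": rows}
    attr = attrs[depth % len(attrs)]
    cut = median_low(r[attr] for r in rows)
    left = [r for r in rows if r[attr] <= cut]
    right = [r for r in rows if r[attr] > cut]
    if not left or not right:
        # All values equal on this attribute: stop splitting here.
        return {"leaf": True, "rows": rows}
    return {
        "leaf": False,
        "attr": attr,
        "cut": cut,
        "left": build_tree(left, attrs, max_depth, depth + 1),
        "right": build_tree(right, attrs, max_depth, depth + 1),
    }


def leaves_to_scan(node, attr, lo, hi):
    """Count leaf partitions that may hold rows with lo <= row[attr] <= hi."""
    if node["leaf"]:
        return 1
    if node["attr"] != attr:
        # This level splits on another attribute, so both sides may match.
        return (leaves_to_scan(node["left"], attr, lo, hi)
                + leaves_to_scan(node["right"], attr, lo, hi))
    count = 0
    if lo <= node["cut"]:
        count += leaves_to_scan(node["left"], attr, lo, hi)
    if hi > node["cut"]:
        count += leaves_to_scan(node["right"], attr, lo, hi)
    return count


# Hypothetical driving-pattern rows with two distinct-valued attributes.
rows = [{"speed": i, "hour": (i * 37) % 1024} for i in range(1024)]
tree = build_tree(rows, ["speed", "hour"], max_depth=4)  # 16 leaf partitions

print(leaves_to_scan(tree, "speed", 10, 10))
print(leaves_to_scan(tree, "hour", 10, 10))
```

With this toy data, a point filter on either attribute touches 4 of the 16 partitions, which is the sense in which no single attribute is favored. Adapting the layout to the observed query sequence would mean replacing the round-robin choice with one driven by which attributes recent queries filter on; that part is not sketched here.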
Journal: PVLDB
Volume: 9
Publication date: 2016